The following content has been provided by the University of Erlangen-Nürnberg.
We are now in the final part of the course. We will discuss two major topics. One is model assessment and the bias-variance trade-off; that is what I am going to discuss with you today.
Next week I will be in China again, so either Stefan or Andreas will give two lectures on AdaBoost, which is one of the most powerful classifiers that currently exists. This algorithm is based on the idea that the loss function is no longer the zero-one loss, but an exponential loss, which has no discontinuities and is differentiable. You can then try to find, as we did for the Bayesian classifier, the classifier that minimizes the average loss, now using the exponential loss function, and you end up with the AdaBoost classifier, which is widely used in object tracking, for instance, and in many other practical applications of pattern recognition technology in the real world.
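As a compact sketch of what this means formally (the symbols F, h_m, alpha_m, and epsilon_m are shorthand introduced here for illustration, not notation taken from the lecture):

```latex
% Exponential loss in place of the zero-one loss, for labels y in {-1, +1}:
L\bigl(y, F(\vec{x})\bigr) = \exp\bigl(-y\,F(\vec{x})\bigr)

% AdaBoost builds the discriminant F as a weighted sum of weak classifiers h_m:
F(\vec{x}) = \sum_{m=1}^{M} \alpha_m\, h_m(\vec{x})

% Stage-wise minimization of the average exponential loss E[exp(-y F(x))]
% yields the familiar weight for each weak classifier, where
% \varepsilon_m is the weighted training error of h_m:
\alpha_m = \tfrac{1}{2}\,\ln\frac{1 - \varepsilon_m}{\varepsilon_m}
```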
Also, looking back at the contents of the lecture, I think you got a fairly consistent overview of different ways to construct decision boundaries and of how to judge the value of a particular decision boundary, its pros and cons. In the summer semester, my colleague Professor Nöth will go into much more detail on how pattern recognition technology can be used to analyze more complex patterns, such as spoken language recognition, dialogue systems, image analysis problems, and so on. Good. So let's consider the topics of today.
So far we have seen a hundred ways of constructing decision boundaries, and now the question is: which one is the best? That is something we always have to face. For particular problems, particular classifiers do the best job. The question is: is there a classifier that performs in an excellent manner for all possible applications? And the answer is no. It is like Heisenberg's uncertainty principle: you cannot have both things at the same time. There is no free lunch, basically; if you want something, you have to pay for it. All classifiers have some drawbacks. We will talk a little bit about off-training-set errors. The off-training-set error basically means: what is the classification error you get if you feed the classifier observations that were not considered in the training phase? So we are considering, and that is the term I have used in previous lectures, the generalization capabilities of a classifier.
How does the classifier generalize? How does it behave with feature vectors that were not part of the training set? This is very important: if you do experiments with classifiers, you should not use your training data to check how well the classifier is performing. If you are doing a proper job and you test on your training data, your classifier should work with a 100% recognition rate. And for me and for many other researchers, if somebody reports experiments with a 100% recognition rate, you can be quite sure that something went wrong. The right question to ask at this point is: is your training set different from your test set? Are the two sets disjoint? Quite often you will hear the answer: oh, you know, I had problems getting the data, and I'm so happy about the data I have, so I did the training and the testing using the same data. And then you have basically overfit the whole thing.
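To make the point about disjoint sets concrete, here is a minimal Python sketch; the toy data set and the unpruned decision tree are placeholders chosen purely for illustration, not anything used in this course:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy data set; a stand-in for whatever features you actually collected.
X, y = load_iris(return_X_y=True)

# Keep the training and test sets disjoint.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# The recognition rate on the training data is misleadingly optimistic;
# an unpruned tree typically scores 100% here.
print("training accuracy:", clf.score(X_train, y_train))

# The off-training-set (test) accuracy is the number to report.
print("test accuracy:", clf.score(X_test, y_test))
```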
Then we will talk about one thing that is very, very crucial: the bias-variance trade-off. Bias is a systematic error, and variance describes how much things jump back and forth. For instance, if you estimate a decision boundary, you can look at its variance: if you train on different data sets, the estimated boundary can change a lot. Or you can have a systematic error; for a linear decision boundary, for instance, a systematic error could be that you have an offset. That is the bias. And it is hard, or nearly impossible, to reduce both bias and variance at the same time. Either you have a large systematic error and small variations, which is true for classifiers with a linear decision boundary: they do not change much if you perturb the data a little, but they can have a huge systematic error when the true class boundary is not linear.
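For the squared-error case, the trade-off can be written as the usual decomposition; this is a standard textbook formula added here for reference, not copied from the lecture slides:

```latex
% y = f(x) + \epsilon with E[\epsilon] = 0 and Var[\epsilon] = \sigma^2;
% \hat{f}(x; D) is the predictor estimated from training set D.
\mathbb{E}_{D,\epsilon}\Bigl[\bigl(y - \hat{f}(\vec{x}; D)\bigr)^2\Bigr]
  = \underbrace{\bigl(\mathbb{E}_D[\hat{f}(\vec{x}; D)] - f(\vec{x})\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\Bigl[\bigl(\hat{f}(\vec{x}; D) - \mathbb{E}_D[\hat{f}(\vec{x}; D)]\bigr)^2\Bigr]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```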